Complexity and Approximation of the Fuzzy K-Means Problem

Authors

  • Johannes Blömer
  • Sascha Brauer
  • Kathrin Bujna
Abstract

The fuzzy K-means problem is a generalization of the classical K-means problem to soft clusterings, i.e. clusterings where each point belongs to each cluster to some degree. Although popular in practice, prior to this work the fuzzy K-means problem had not been studied from a complexity theoretic or algorithmic perspective. We show that optimal solutions for fuzzy K-means cannot, in general, be expressed by radicals over the input points. Surprisingly, this already holds for very simple inputs in one-dimensional space. Hence, one cannot expect to compute optimal solutions exactly. We give the first (1+ε)-approximation algorithms for the fuzzy K-means problem. First, we present a deterministic approximation algorithm whose runtime is polynomial in N and linear in the dimension D of the input set, given that K is constant, i.e. a polynomial time approximation algorithm given a fixed K. We achieve this result by showing that for each soft clustering there exists a hard clustering with comparable properties. Second, by using techniques known from coreset constructions for the K-means problem, we develop a deterministic approximation algorithm that runs in time almost linear in N but exponential in the dimension D. We complement these results with a randomized algorithm which imposes some natural restrictions on the input set and whose runtime is comparable to some of the most efficient approximation algorithms for K-means, i.e. linear in the number of points and the dimension, but exponential in the number of clusters.
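To make the notion of a soft clustering concrete, the following is a minimal sketch of the commonly used fuzzy K-means objective, not code from the paper. It assumes the standard formulation in which each point n has memberships r[n][k] that sum to 1 over the clusters and the cost is the sum of r[n][k]**m times the squared Euclidean distance to center k. The fuzzifier `m` and the function names `fuzzy_kmeans_cost` and `optimal_memberships` are assumptions introduced here for illustration; the abstract does not name them.

```python
def squared_distance(x, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def fuzzy_kmeans_cost(points, centers, memberships, m=2.0):
    """Sum over points n and clusters k of r[n][k]**m * ||x_n - c_k||^2.

    Assumption: m > 1 is the fuzzifier of the standard formulation.
    """
    total = 0.0
    for x, r in zip(points, memberships):
        for c, r_nk in zip(centers, r):
            total += (r_nk ** m) * squared_distance(x, c)
    return total


def optimal_memberships(points, centers, m=2.0):
    """Cost-minimizing memberships for fixed centers (standard closed form)."""
    result = []
    for x in points:
        sq = [squared_distance(x, c) for c in centers]
        if any(d == 0.0 for d in sq):
            # The point coincides with a center: split all weight
            # evenly among the coinciding centers.
            hits = [1.0 if d == 0.0 else 0.0 for d in sq]
            s = sum(hits)
            result.append([h / s for h in hits])
            continue
        # r_nk is proportional to (squared distance) ** (-1 / (m - 1)).
        inv = [d ** (-1.0 / (m - 1.0)) for d in sq]
        s = sum(inv)
        result.append([v / s for v in inv])
    return result


if __name__ == "__main__":
    # One-dimensional toy input, chosen only for illustration.
    points = [(0.0,), (1.0,), (4.0,)]
    centers = [(0.0,), (4.0,)]
    r = optimal_memberships(points, centers)
    print(r)
    print(fuzzy_kmeans_cost(points, centers, r))
```

In this toy input the middle point splits its membership between both centers, which is exactly what distinguishes a soft clustering from a hard one. The sketch only evaluates the objective for given centers; it does not implement any of the approximation algorithms described in the abstract.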


Similar resources

Approximation Solutions for Time-Varying Shortest Path Problem

Abstract. Time-varying network optimization problems have traditionally been solved by specialized algorithms. These algorithms have NP-complement time complexity. This paper considers the time-varying shortest path problem, which can be optimally solved in O(T(m + n)) time, where T is a given integer. For this problem with arbitrary waiting times, we propose an approximation algorithm, whic...


ADAPTIVE BACKSTEPPING CONTROL OF UNCERTAIN FRACTIONAL ORDER SYSTEMS BY FUZZY APPROXIMATION APPROACH

In this paper, a novel problem of observer-based adaptive fuzzy fractional control for fractional order dynamic systems with commensurate orders is investigated; the control scheme is constructed by using the backstepping and adaptive technique. Dynamic surface control method is used to avoid the problem of “explosion of complexity” which is caused by backstepping design process. Fuzzy logic sy...


ROUGH SET OVER DUAL-UNIVERSES IN FUZZY APPROXIMATION SPACE

To tackle problems involving inexact, uncertain, and vague knowledge, a constructive method is utilized to formulate lower and upper approximation sets. A rough set model over dual-universes in fuzzy approximation space is constructed. In this paper, we introduce the concept of rough set over dual-universes in fuzzy approximation space by means of cut set. Then, we discuss properties of rough se...


An Approximation Method for Fuzzy Fixed-Charge Transportation Problem

In this paper, we develop the fuzzy fixed-charge transportation problem when the costs are fuzzy numbers. As a first step, it is transformed into the classical fuzzy transportation problem. Next, we obtain the best fuzzy approximation of the optimal value of the fuzzy fixed-charge transportation problem. This method obtains both lower and upper bounds on the fuzzy optimal value of the fuzzy ...


Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...



Journal:
  • CoRR

Volume abs/1512.05947

Publication date 2015